Robust Planning in Domains with Stochastic Outcomes, Adversaries, and Partial Observability

Authors

  • Hugh Brendan McMahan
  • Jeff Schneider
  • Andrew Ng
Abstract

Real-world planning problems often feature multiple sources of uncertainty, including randomness in outcomes, the presence of adversarial agents, and lack of complete knowledge of the world state. This thesis describes algorithms for four related formal models that can address multiple types of uncertainty: Markov decision processes, MDPs with adversarial costs, extensive-form games, and a new class of games that includes both extensive-form games and MDPs as special cases. Markov decision processes can represent problems where actions have stochastic outcomes. We describe several new algorithms for MDPs, and then show how MDPs can be generalized to model the presence of an adversary that has some control over costs. Extensive-form games can model games with random events and partial observability. In the zero-sum perfect-recall case, a minimax solution can be found in time polynomial in the size of the game tree. However, the game tree must “remember” all past actions and random outcomes, and so the size of the game tree grows exponentially in the length of the game. This thesis introduces a new generalization of extensive-form games that relaxes this need to remember all past actions exactly, producing exponentially smaller representations for interesting problems. Further, this formulation unifies extensive-form games with MDP planning. We present a new class of fast anytime algorithms for the off-line computation of minimax equilibria in both traditional and generalized extensive-form games. Experimental results demonstrate their effectiveness on an adversarial MDP problem and on a large abstracted poker game. We also present a new algorithm for playing repeated extensive-form games that can be used when only the total payoff of the game is observed on each round.
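
To make the first model in the abstract concrete, the following is a minimal sketch of a cost-based MDP with stochastic action outcomes, solved by standard value iteration. It is not one of the thesis's new algorithms; the three-state problem, the action names ("safe", "risky", "go"), and the helper functions are hypothetical and chosen only for illustration.

```python
# Hypothetical 3-state MDP: states 0 and 1, plus an absorbing goal state 2.
# TRANSITIONS[state][action] is a list of (probability, next_state, cost).
TRANSITIONS = {
    0: {
        "safe": [(1.0, 1, 2.0)],                    # deterministic but expensive
        "risky": [(0.5, 2, 1.0), (0.5, 0, 1.0)],    # may reach the goal or stay put
    },
    1: {"go": [(1.0, 2, 1.0)]},
    2: {},  # goal state: no actions, zero cost-to-go
}


def expected_cost(outcomes, values):
    """Expected immediate cost plus cost-to-go over an action's outcomes."""
    return sum(p * (cost + values[nxt]) for p, nxt, cost in outcomes)


def value_iteration(transitions, tol=1e-9, max_iters=10_000):
    """Standard in-place value iteration minimizing expected total cost."""
    values = {state: 0.0 for state in transitions}
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:  # absorbing goal state keeps value 0
                continue
            best = min(expected_cost(o, values) for o in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    return values


def greedy_policy(transitions, values):
    """Pick, in each non-goal state, the action with least expected cost."""
    return {
        state: min(actions, key=lambda a: expected_cost(actions[a], values))
        for state, actions in transitions.items()
        if actions
    }


values = value_iteration(TRANSITIONS)
policy = greedy_policy(TRANSITIONS, values)
```

In this toy problem the "risky" action has expected cost-to-go 2 from state 0, against 3 for "safe", so the greedy policy prefers it. An adversary with some control over costs, as the abstract describes, would change this calculation; that extension is not sketched here.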


Related resources

Planning with Extended Goals and Partial Observability

Planning in nondeterministic domains with temporally extended goals under partial observability is one of the most challenging problems in planning. Simpler subsets of this problem have already been addressed in the literature, but the general combination of extended goals and partial observability is, to the best of our knowledge, still an open problem. In this paper we present a first attempt...


Hierarchical Task Planning under Uncertainty

In this paper we present an algorithm for planning in nondeterministic domains. Our algorithm, C-SHOP, extends the successful classical HTN planner SHOP by introducing new mechanisms to handle situations where there is incomplete and uncertain information about the state of the environment. Being an HTN planner, C-SHOP supports coding domain-dependent knowledge in a powerful way that describes h...


Planning with Nondeterministic Actions and Sensing

Many planning problems involve nondeterministic actions, that is, actions whose effects are not completely determined by the state of the world before the action is executed. In this paper we consider the computational complexity of planning in domains where such actions are available. We give a formal model of nondeterministic actions and sensing, together with an action language for specifying planning...


Planning in Nondeterministic Domains under Partial Observability via Symbolic Model Checking

Planning under partial observability is one of the most significant and challenging planning problems. It has been shown to be hard, both theoretically and experimentally. In this paper, we present a novel approach to the problem of planning under partial observability in non-deterministic domains. We propose an algorithm that searches through a (possibly cyclic) and-or graph induced by the dom...


A Framework for Planning with Extended Goals under Partial Observability

Planning in nondeterministic domains with temporally extended goals under partial observability is one of the most challenging problems in planning. Subsets of this problem have already been addressed in the literature. For instance, planning for extended goals has been developed under the simplifying hypothesis of full observability. And the problem of partial observability has been tackled ...


Publication date: 2006